Winsock
Winsock is an application programming interface (API) for
the Windows environment. It is a modification of an earlier
API for the Unix environment and actually is an admirable job
of trying to fit a square peg into a round hole.
All of the references to 'Blocking', 'Non-blocking', WSAAsync...,
etc. come from this conversion process.
Basically, the Unix operating system does not allow application
programmers to operate in something called Interrupt Mode. They must do
everything in Polled Mode. (Application programs are programs that ride on top of
the built-in operating system and are usually designed to 'do'
something. Systems programs are written to provide underlying services
to Application programs. Windows is a system program. A program you
might write, an Employee database, for example, is an Application.)
Before we get to whatever the heck that means, a word about where I'm coming from:
I've got over 20 years programming experience. Cyber systems to DEC minis to PCs.
Machine language, Assembler, C, C++, .....etc. DOS, OS, Unix......
So I guess I can poke a little fun at some institutional ideas and concepts.
You know, like Unix being the e.e.cummings of operating systems.
Interrupt vs polling and blocking vs non-blocking:
In the systems world programmers have to work with very fast CPU's talking to
very slow devices. An example is transmitting data over a phone line. You always have
to send a character and then wait until the device can accept another.
If you use polled mode you will have a transmit loop that sends, loops (waits)
until the device is ready, and then sends the next. It's this waiting that
causes a problem. You are tying up CPU time which could be used for something
else, like maybe another transmit routine. So the hardware guys came up with
something called Interrupts. Now you simply arm the device (telephone interface) to
send an interrupt whenever it is ready for another character, and then
go off and do something else. The device needs a character and sends an interrupt.
When the interrupt occurs, the interrupt system hardware interrupts your
program, sets bookmarks so you can get back, and starts running your interrupt routine.
This is a separate module which handles the feeding of another character to the device
and then tells the hardware it is finished. The hardware uses the bookmarks to
resume your main program at the point it was interrupted. Elegant no?
Unfortunately the applications guy in Unix can't do this (Unix?, the end-all be-all?).
Instead he has to poll. Of course, Unix being multi-tasking, lets him merely
write another routine for the other process...and another...and another.
Now in Windows (poor old slow Windows) there is a thing called messages. Guess
what? These are like interrupts to your main program. Now all you have to do is
tell your device handler to send a message to your message handler and then start
your main routine. Whenever the device guy sends a message, Windows stops your
main and throws you into the message routine (an Event!). Your message routine
(you know, like a click event) processes the message and when it terminates control
goes back to your main at the point it was interrupted.
That's why Winsock has all these add-ons to the original Berkeley Sockets stuff
to get from polled to interrupt, being that polled in Windows has some very bad
side effects, mainly performance problems.
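To make the polled vs. interrupt difference concrete, here is a rough sketch in C. Everything in it is made up for illustration: the device_t "device" is just a counter that becomes ready every fifth tick, and the "interrupt" is an ordinary function call that real hardware (or Windows, with a message) would make for you. It only shows where the time goes in each style.

```c
#include <stddef.h>

/* Hypothetical slow device: ready to accept a character once every 5 ticks. */
enum { READY_EVERY = 5 };

typedef struct {
    unsigned long tick;   /* simulated clock */
    char out[64];         /* characters the device has accepted so far */
    size_t out_len;
} device_t;

static int device_ready(const device_t *d) {
    return d->tick % READY_EVERY == 0;
}

static void device_put(device_t *d, char c) {
    if (d->out_len < sizeof d->out - 1) {
        d->out[d->out_len++] = c;
        d->out[d->out_len] = '\0';
    }
}

/* Polled mode: send, then spin until the device is ready again.
   Returns the number of ticks burned doing nothing but waiting. */
unsigned long send_polled(device_t *d, const char *msg) {
    unsigned long wasted = 0;
    for (; *msg; ++msg) {
        d->tick++;
        while (!device_ready(d)) {
            d->tick++;
            wasted++;
        }
        device_put(d, *msg);
    }
    return wasted;
}

/* Interrupt style: the transmit state lives outside the main loop. */
typedef struct {
    const char *next;     /* next character to feed the device */
} tx_state_t;

/* The "interrupt routine": feed one character and return. */
void on_device_ready(device_t *d, tx_state_t *tx) {
    if (*tx->next) {
        device_put(d, *tx->next++);
    }
}

/* The main program keeps doing useful work; the handler is only entered
   when the device is ready. Returns the total ticks elapsed. */
unsigned long send_interrupt(device_t *d, const char *msg,
                             unsigned long *useful_work) {
    tx_state_t tx = { msg };
    while (*tx.next) {
        d->tick++;
        if (device_ready(d)) {
            on_device_ready(d, &tx);   /* hardware would do this dispatch */
        } else {
            (*useful_work)++;          /* main program does something else */
        }
    }
    return d->tick;
}
```

Both versions take the same number of ticks to send the string. The difference is that the ticks the polled loop throws away show up as useful work in the interrupt version.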
Blocking vs non-blocking
Basically, the non-blocking calls are calls that will cause a message (interrupt)
at some later time, and blocking calls are those that do something
and then poll until they get an answer and THEN they return.
An example is the set of DNS name calls. These are used to get an Internet address which
corresponds to a Domain name, or vice-versa. Out on the net are DNS name servers
whose job in life is to convert from one to the other. If you use the blocking
call to convert a Domain name to an Internet address (more on this later) then
Winsock calls out on the network to the Domain Name System to get the matching
Internet address. This takes time, especially if no match is found. Winsock will
not return to you until it gets the response or times out. So your program acts
like it is "hung". In VB you are supposed to use the DoEvents call to return control
to Windows. There is a similar call in C. Apparently most Winsock programmers forgot
to do this in their polling loop, another reason why your system seems to 'hang'.
This is supposed to go away in Win95, which doesn't need DoEvents as it is supposed to
interrupt any program periodically to check other programs.
Better to use the non-blocking call. Winsock starts the name lookup and returns
to your program. Later, on name resolution (don't you love this techie terminology?)
or time out, Winsock will "post" (send) a message to your message routine. Your
main program is interrupted wherever it is, and the message event routine is
started. It gets the message parameters, does whatever you want (like updating
an array of values relating to that name/address) and then terminates. Your
main resumes where it was interrupted as if nothing happened.
Congratulations, you now know from blocking and non-blocking calls. The non-blocking
(message using) calls are prefixed with the term 'WSAAsync'. (Async is another
painful way of trying to describe interrupt activity.)
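Here is a small C sketch of the two calling styles. It does not touch the real Winsock API: fake_dns, lookup_blocking, lookup_async and pump_messages are all invented names, and the "network" is a one-entry table. The point is only the shape: the blocking call hands back the answer itself, while the non-blocking call returns right away and the answer shows up later in a handler.

```c
#include <stdint.h>
#include <string.h>

/* Signature of the "message routine" that receives the answer later. */
typedef void (*lookup_handler)(const char *name, uint32_t addr, int ok);

/* Hypothetical stand-in for the Domain Name System: one hard-coded entry. */
static int fake_dns(const char *name, uint32_t *addr) {
    if (strcmp(name, "example.test") == 0) {
        *addr = 0xC0000201u;   /* 192.0.2.1, a documentation-only address */
        return 1;
    }
    return 0;
}

/* Blocking style: does not return until it has an answer (or a failure). */
int lookup_blocking(const char *name, uint32_t *addr) {
    return fake_dns(name, addr);
}

/* Non-blocking style: remember the request and return immediately. */
enum { MAX_PENDING = 8 };
static struct {
    const char *name;
    lookup_handler handler;
} pending[MAX_PENDING];
static int pending_count = 0;

int lookup_async(const char *name, lookup_handler handler) {
    if (pending_count == MAX_PENDING) {
        return 0;
    }
    pending[pending_count].name = name;
    pending[pending_count].handler = handler;
    pending_count++;
    return 1;   /* nothing has been resolved yet */
}

/* Plays the part of Windows posting the messages: finish every pending
   lookup and call its handler. Returns how many handlers were called. */
int pump_messages(void) {
    int delivered = 0;
    for (int i = 0; i < pending_count; i++) {
        uint32_t addr = 0;
        int ok = fake_dns(pending[i].name, &addr);
        pending[i].handler(pending[i].name, addr, ok);
        delivered++;
    }
    pending_count = 0;
    return delivered;
}

/* Example handler: just remembers the last result it was given. */
static uint32_t last_addr = 0;
static int last_ok = -1;

void remember_result(const char *name, uint32_t addr, int ok) {
    (void)name;
    last_addr = addr;
    last_ok = ok;
}
```

Note that after lookup_async returns, last_ok is still untouched. The result only arrives when the pump runs, which is exactly the "Winsock will post a message later" behavior.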
Addresses, structures, pointers etc.:
Now we get to C. (it was inevitable). C was designed as a way to write system
programs that could be "portable", that is the same source code could be compiled
on different computers and the resulting executable code would magically work.
A GREAT idea, and it sometimes almost is true. Then applications programmers
(actually schools started teaching it) discovered that they could also
write their stuff in C, if only a leetle teeney bit was added. C mushroomed.
And now every body thinks that if it ain't written in C it ain't any good.
My own take is, portability not withstanding, if you need systems performance,
write it in assembler. Ever look at a reverse assembled C program? And, if you
want to develop an application program (one that people interact with somehow,
AND is usually written to make money...something all those college profs have
never thought about) then use the HIGHEST level language you can use.
Think about it, what is the definition of the BEST program? I like to think
of it as the one that is developed in the LEAST amount of time, that WORKS, and,
VERY important, is MAINTAINABLE. Period. Like it or not programming is a way
to create something that ultimately generates bucks. (Most programming is commercial
in nature, not recreational). Please don't respond to differ...this is just my own
personal opinion.
C
OK, so now I got to bash C, now let's see some details:
C was an Assembly language substitute designed for portability. Portability
requires the code syntax to somehow not be concerned with the underlying
machine specifics......like memory addressing, because if you are going to
'port' the code to a different computer then, in all likelihood, its hardware
characteristics are different than any other machine. (Now you know why the
PC/Intel x86 architecture is so popular, the hardware is backward compatible. Also
now you see why non-Intel stuff like Power PCs and Macs don't have all this software
written in C for Intel platforms. Write it in C and it's portable...yeah right.)
Memory
Computers have memory. Memory is a huge pile of buckets (locations), each with
a unique numerical address, just like houses. These buckets contain other
numbers (this is because computers are pretty stupid, they can only add and
shift numbers.) which is data or, guess what?...addresses. Huh?
Let's do that again. Any computer memory location contains a number. This number
can be data (number of employees), a DIRECT address (the address of the bucket
that contains the number of employees) or an INDIRECT address (the address of
the address of the bucket with the number of employees). And, this nested
indirect stuff can go on forever, depending on the twists and turns of the
programmer's mind who set up the program. Winsock does have some double and triple
nested indirect addresses.
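In C terms, data, a direct address and an indirect address look like the sketch below. The function names are mine, chosen only to match the wording above; each extra '*' is one more bucket you have to look into before you reach the number of employees.

```c
/* p holds a DIRECT address: the address of the bucket with the data. */
int read_direct(int *p) {
    return *p;
}

/* pp holds an INDIRECT address: the address of a bucket that in turn
   holds the address of the data. */
int read_indirect(int **pp) {
    return **pp;
}

/* One more level of nesting: address of an address of an address. */
int read_double_indirect(int ***ppp) {
    return ***ppp;
}
```

A usage example: with `int employees = 42; int *d = &employees; int **i = &d;`, both `read_direct(d)` and `read_indirect(i)` end up at the same bucket.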
This is what C guys are trying to say when they throw out words like "pointer" and "dereference".
They are trying to eventually get to the number of employees, as we all are.
Ever read a C primer? They are full of little diagrams of eyeballs peeking
thru fingers at computer memory locations, or of little rulers and sticks
with memory locations stacked up. All trying to explain direct and indirect
memory addressing. And no wonder, C is supposed to isolate the programmer from the
underlying hardware and then they have to explain......the underlying
hardware. First time I ever read K&R (the C bible, written by the guys that
invented it) they actually bragged that they had NEVER written anything in
Assembler. I almost fell out of my chair.
Anyway, Winsock is written in C and messes with data in C format. The
documentation shows data structures, passed parameters, etc. in C format. To
use Winsock we have to translate that into Visual Basic terms. Also, we have
another problem. If C was used to isolate the programmer from the hardware,
guess what an even higher language, like VB, does? It REALLY isolates you
from the hardware and goes to great lengths to NOT let you mess with
memory addressing. VB doesn't even have PEEK and POKE. (Although I must admit,
this is one way for a C guy to retaliate....words like PEEK and POKE? It's
embarrassing). Luckily, Windows has a ton of DLLs which you can call from VB.
There is everything there needed to talk to Winsock...that's why we're here.
So now we gotta call Winsock to do something. In most cases we have to provide
parameters and, after the call, we have to retrieve them. Unfortunately Winsock
thinks it is talking to a C program and expects things in a format which VB
can't easily do.
(You know, a big reason I got started with this VB/Winsock thing was after
reading an Email from some Unix/C weenie who was pontificating to a beginning
VB programmer who had the temerity to ask the Big Kahuna if it was possible
to use Winsock directly from VB. His Ceeness replied that "due to the inherent
limitations of VB it is impossible". I hope he reads this.)
And now for an advertisement: check out www.wwwdev.com for VBServer. It's a Web server written entirely in Visual Basic and uses Winsock DLL calls only. No VBXs to get to Winsock. They definitely are not needed and VBServer works just fine.
GetHostByName, an example.
GetHostByName is a Winsock call that takes a Domain name in 'char' format (a string)
and hands back a 32 bit (VB long&) value that leads us to the Internet address
associated with the passed Domain name.
First we have to get around the indirect addressing stuff, that is translate
that C 'far', 'pointer', '*' stuff into English and then into VB.
OK class, get out your books which have the Winsock call definitions. Let's
look at 'gethostbyname'. C calls are functions, they return a value (Well, VOIDs
are functions which don't return anything which is a sub but we don't have to
deal with that). The call format is X=somecall(A,B,C). You set up the parameters
A, B, and C, call 'somecall', and X gets set with the return value. Pretty
simple. Except that the function may expect A to be the A data, B the address
of the B data, and C to be the address of the address of the C data. And maybe X
is actually the address of the address of....Get the picture?
Now for GetHostByName. Ignore the return value for now, that's a structure and is a
later subject.
Strings
The first parameter says 'const char FAR* name'. That's Sanskrit for the fact
that the parameter passed is the address of a string. Strings are different
in C and VB. VB stores a description of the string (like length). When VB gets
a string it knows how many characters to get because the descriptor tells it
how long the string is. C strings are null terminated, that is the last character
is followed by a byte whose value is zero. Boy does this cause grief for C
programmers if they mess up. If you are processing a string in C and have
forgotten the terminating NULL you will run right past the end and will keep getting
whatever data is in memory until you finally encounter a null byte. (Like beaucoup
K later.)
String conversion from C to VB
Luckily, (actually by design), VB will convert its strings to NULL terminated
when you call a DLL so you don't have to worry about fooling with VB string
descriptors and null-terminated strings. But you DO have to worry on some
calls that return strings or addresses of strings because, remember, Winsock
thinks it is talking to a C program so will return C-type references (pointers
or memory addresses) to C strings. By the way C-talk for addresses is 'pointer'.
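The two string styles can be sketched in C like this. The counted_string type is a made-up stand-in for a string with a length descriptor; it is NOT VB's actual internal layout, just the same idea. The conversion function does roughly what VB does for you on a DLL call: copy the characters and tack a zero byte on the end.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "descriptor" string: the length is stored, no terminator. */
typedef struct {
    size_t len;
    const char *data;
} counted_string;

/* Descriptor style: the length is simply read from the descriptor. */
size_t counted_length(const counted_string *s) {
    return s->len;
}

/* C style: walk the bytes until a zero byte turns up. If the terminator
   is missing, this loop happily runs off the end of the string. */
size_t c_length(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Convert a counted string into a null-terminated one in a caller-supplied
   buffer. Returns 1 on success, 0 if the buffer is too small. */
int counted_to_c(const counted_string *s, char *out, size_t out_size) {
    if (s->len + 1 > out_size) {
        return 0;
    }
    memcpy(out, s->data, s->len);
    out[s->len] = '\0';
    return 1;
}
```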
GetHostByName string parameters.
Back to 'const char FAR* name'.
'const'
The details of the string parameter start with 'const', which is C's promise
that the function will only read the string and never change it. Look at the
call gethostname: it doesn't have 'const', because there Winsock writes the
host name INTO the buffer you pass. Anyway, for our VB purposes we ignore it.
'char'
Next is 'char' which means the variable is a bunch of characters (a C string
terminated by a null). We are going to pass a VB string and let VB convert it
to C format.
'FAR*'
Then we get 'FAR*' which says the parameter is an address ('*') pointing to
the data and that the address is in 'FAR' format. Without going into a lot of
detail, a FAR address is 32 bits long in a PC (VB long& data type). In assembler
we use two 16 bit values for a 32 bit address, segment and offset. Any address
in a PC can be described this way. A group of 16 bit addresses from 0 to all ones
is 64K. Thus the total address space you can reference using the SAME segment
number is 64K. Sound familiar? Yep, it's the Intel 64k segment 'limitation'.
Well, not really a limitation, just change the segment number and get more.
This is how mainframe/mini memory management works. You could design a new
computer that uses more address bits and see more than 64K, but then all those
memory address paths have to change, and finally, how many bits? No matter
how many you use you'll be accused at a later date of creating a memory
segment 'limitation'. (Just like now with the 68xxx stuff used in Mac's, which
use this 'flat' memory model.) Anyway, in Winsock (PC/Intel) all addresses
are 32 bits or FAR and thus fit into a VB type long& variable.
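As a sketch of how two 16 bit values share one 32 bit FAR address, here is the packing in C. The function names are invented for this example, and I'm assuming the usual layout with the segment in the high word and the offset in the low word. An unsigned 32 bit type is used here; a VB long& is signed, but it holds the same 32 bits.

```c
#include <stdint.h>

/* Pack segment (high 16 bits) and offset (low 16 bits) into one value. */
uint32_t make_far(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 16) | offset;
}

/* Pull the segment back out of the high word. */
uint16_t far_segment(uint32_t far_address) {
    return (uint16_t)(far_address >> 16);
}

/* Pull the offset back out of the low word. */
uint16_t far_offset(uint32_t far_address) {
    return (uint16_t)(far_address & 0xFFFFu);
}
```

With the segment fixed, the offset can only run from 0x0000 to 0xFFFF, which is the 64K you can reach without changing the segment number.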
'name'
Finally 'name' stands for the name of the variable you will provide.
So a call to GetHostByName in VB looks like:
hostent&=gethostbyname(my_name$).
Structures
Now what about hostent, the variable that gets set by Winsock on return from
the call? This is where I spent a LOT of time, trying to figure out what all
those structures in the spec were for and how they really looked in memory.
Once you figure this out you've got it. Oh, yeah, structures. VB calls them
'user defined types'.
What you are really doing is describing a chunk of memory........
I got this here chunk and the first word (16 bits)is an integer%, the next
2 words (16 bits * 2 = 32 bits) is a long&, and the last 20 bytes (8 bits each)
are characters.
Byte data
VB will not let you directly access bytes (PEEK and POKE, INP and OUT used to).
But there is a technique which gets around this, although a little clumsy. (Why
can't Microsoft give us a byte variable?). What you do is define a fixed string
of x bytes long. Now VB will use these bytes as characters. But to get the actual
number (data) in the character position we have to use the VB ASC() function to
get it. So, for example, I use, on byte-oriented operation, a fixed-string variable
that I define as : abyte as string*1. Then, if for example I read a binary file,
one byte (GET #1,,abyte), and the data in the file at that byte position was, say,
99 decimal, the assignment somevariable%=asc(abyte), then somevariable% is now
set to 99. Cool....but clumsy.
The above example type defining the chunk of memory is:
    Type my_type
        an_integer As Integer
        a_long As Long
        twenny_bytes As String * 20
    End Type

C does the same thing. The trick is to describe the memory layout using the VB syntax and then set the values and then call Winsock, providing the address of the structure (type).
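For comparison, the same chunk of memory described in C might look like the sketch below. One assumption to flag loudly: I'm using '#pragma pack' (supported by the common compilers, but not part of standard C) to tell the compiler not to slip padding bytes between the fields, so the layout is exactly 2 + 4 + 20 bytes as described above.

```c
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)          /* assumption: no padding between fields */
typedef struct {
    int16_t an_integer;        /* first word, 16 bits (VB integer%) */
    int32_t a_long;            /* next 2 words, 32 bits (VB long&)  */
    char    twenny_bytes[20];  /* last 20 bytes of characters       */
} my_type;
#pragma pack(pop)

/* The C counterpart of VB's Len() on the type: total bytes in the chunk. */
size_t my_type_len(void) {
    return sizeof(my_type);
}
```

Without the packing directive many compilers would line a_long up on a 4 byte boundary and the chunk would grow, which is one reason the C and VB descriptions of a structure have to be compared byte by byte.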
Winsock returned structures (VB types)
Some Winsock calls let you PASS a structure that gets filled in by Winsock. These
are easy, as you tell Winsock where to put the returned data by passing it the
address of your structure in the call. The Winsock call 'connect' does this.
But GetHostByName does things a little differently. Remember that the returned parameter
was something called 'hostent&', a VB long& value. This is the address of the
hostent structure SOMEWHERE in Winsock's own memory space. What Winsock did was
get the data and put it into a chunk of Winsock's memory which is mapped like
the hostent structure and then returned the address of the chunk's (structure's) first
memory location. Frustrating, we can't get to it because when we DIMensioned
OUR hostent structure, VB used VB's memory space. Nooo problem, just copy from
the Winsock structure to the VB structure. How many bytes? Well how about using
the VB LEN() function on hostent? Use len(hostent) and you don't have to manually
count how many bytes. Once the data is copied, we can now get to it using our VB
structure tags.
Aaaaand Another reason to copy....the spec indicates that because Winsock
is busy doing its thing it may reallocate the chunk of memory (reuse it), so if
we want the data we better copy it fast. Also, I've had experience with the
dreaded GPF which may have come from me trying to mess with memory either outside
mine and protected or maybe Winsock was trying to change it when I was accessing it.
At any rate, copy it first thing.
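The "copy it fast" advice can be seen in miniature in this C sketch. The record type and lib_lookup are invented: lib_lookup stands in for any library call that, like gethostbyname, returns the address of a chunk of its OWN memory and reuses that chunk on the next call.

```c
#include <string.h>

typedef struct {
    int  id;
    char name[16];
} record;

/* Hypothetical library call: the result lives in the library's own
   storage, which is overwritten by the next call. */
const record *lib_lookup(int id) {
    static record slot;
    slot.id = id;
    strcpy(slot.name, id == 1 ? "alpha" : "beta");
    return &slot;
}

/* Safe pattern: copy the whole structure into our own memory right away. */
record lookup_copy(int id) {
    record mine;
    memcpy(&mine, lib_lookup(id), sizeof mine);
    return mine;
}
```

If you keep only the returned address and the library is called again, the data behind that address silently changes. The copy made by lookup_copy stays put.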
ALSO, more details. Remember that Winsock uses these structures to pass data and
ADDRESSES of data (or addresses of addresses of data). Where do you think these are? You got it, in
Winsock's memory area again. After a while you may figure out (there is a vague reference
to this in the spec) that Winsock tucks in the actual data right after the structure
it allocates. This means to be safe you need to define extra memory locations
right AFTER your structure. Then, when you do the copy the length will include the
extra stuff and it'll sit there in VB's memory space right after the VB type,
and the VB pointers will point to it. (Please, those of you who know about
Relative vs Absolute memory addresses be quiet.)
OK, now we know how to get the Winsock structure data into our VB type, now
we can access the data.......well, almost.
BYVal and BYReference
Remember memory buckets? The memory location can hold data, an address, an address of an address...ad nauseam. Well, VB is pretty simple and basically uses either data or address of data.
Thus we need to be able to define VB data as to what it is and VB only talks about
two types: data and address of data. When we call DLL's we need to know if the
DLL wants a data type or an address of data type. This is where BYVAL comes in.
BYVAL indicates that it is the data method. The absence of BYVAL is the address
of the data method. That was easy. (A special exception is string data. BYVal
is used for something entirely (kinda) different. BYVAL forces VB to PASS the
specified string as the address of a NULL terminated C string. How do we convert
this VB string to a C string? We don't, VB does it automatically. You must
always use BYVAL, as you can see in the calls in VB format.)
So, if the Winsock function has '*' then it wants or gives a pointer (address),
if not then it wants or gives data. If data is used then we use BYVal.
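Seen from the C side, the two methods look like this (a minimal sketch with made-up function names). The first function gets the data itself, which is what a VB ByVal parameter delivers. The second gets the address of the data, which is what VB passes when ByVal is absent.

```c
/* Data method: the function receives a private copy of the number,
   so the caller's variable cannot be changed from in here. */
void add_one_by_value(int n) {
    n = n + 1;
    (void)n;   /* the change is lost when the function returns */
}

/* Address method: the function receives the address of the caller's
   variable and changes the caller's bucket directly. */
void add_one_by_reference(int *n) {
    *n = *n + 1;
}
```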
hostent
Now back to structures and indirect addressing. Check out the hostent
structure in C syntax:
    struct hostent {
        char FAR*      h_name;
        char FAR* FAR* h_aliases;
        short          h_addrtype;
        short          h_length;
        char FAR* FAR* h_addr_list;
    };

h_name is a null terminated string. Remember, it is NOT the string, but the address of the string, and it's a FAR address so it's 32 bits, which in VB is a LONG&.
                     1  contains  0
                     2  contains  0
    FAR*FAR*         3  contains  9   ----> points to location 9
                     4  contains  0
                     5  contains  0
                     6  contains  9
    FAR*FAR*FAR*     7  contains  3   ----> points to location 3
                     8  contains  0
    FAR*             9  contains 73   <--- the data
                    10  contains  0

Our data is 73. But we can refer to it by any of the associated addresses. If we say that our pointer is FAR*FAR*FAR* and that it is a 7 then we are saying that our pointer (7) points to another pointer (3) which points to the data (contents of 9). OK, so what? Well, if we can only get to data in VB directly, that is by using its address then somehow we have to get to the number 9 which is the address of the data. Hmmm, if we had PEEK we could PEEK in 7. We would get 3. Then we PEEK in 3 and we get 9.....the address of the data! Oh, yeah, we don't have PEEK. But we do have (drumroll) hmemcpy, which, of all things is a C call in the Windows DLL (Cymbal).
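The ten-bucket memory above is small enough to simulate. This C sketch uses a plain array as the "memory" (slot 0 is unused so the numbering matches), and the names peek and follow are mine. It walks the same chain: 7 holds 3, 3 holds 9, 9 holds the data.

```c
enum { MEM_SIZE = 11 };

/* Locations 1..10 from the diagram; index 0 is unused padding. */
static const int memory[MEM_SIZE] = { 0, 0, 0, 9, 0, 0, 9, 3, 0, 73, 0 };

/* PEEK: return whatever number sits in the given location. */
int peek(int address) {
    return memory[address];
}

/* Follow a chain of addresses for the given number of levels. */
int follow(int address, int levels) {
    while (levels-- > 0) {
        address = peek(address);
    }
    return address;
}
```

Starting at 7, two levels give 9 (the address of the data) and a third level gives 73 (the data itself).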
hmemcpy
hmemcpy is a nifty little call that copies a block of memory from one location
to another. You just specify the 'from' address, the 'to' address, and how many
bytes. (Technical talk for the 'from' or 'source' address is 'Goesoutta' and the
destination or 'to' address is 'Goesinta').
Two tricks here...pay attention. One, the pointers are all
FAR or 32 bits which is 4 bytes. So our memory blocks for holding FAR addresses are 4 bytes each.
Two, and VERY important, we can force the way we pass the parameters to hmemcpy.
What? Easy, make the destination by reference and the source byvalue (ByVal).
Instant PEEK without the stigma of a silly word. In fact C guys will be
impressed...."Oh yeah, (sigh) I use hmemcpy"...wow.(ALWAYS speak in lower
case). Now to use gethostbyname you define the above C structure in VB type
lingo as:
    Type hostent_type
        h_name As Long
        h_aliases As Long
        h_addrtype As Integer
        h_length As Integer
        h_addr_list As Long
    End Type

    Global hostent As hostent_type

hostent.h_addr_list points to...a list of addresses for this host. Don't worry, just take the first in the list. Do it like this:
    'get host address by name
    'returns address of
    'winsock hostent structure
    ha& = gethostbyname(hostname$)

    'copy winsock structure to vbserve structure
    hmemcpy hostent.h_name, ByVal ha&, Len(hostent)

    'get address of list
    listaa& = hostent.h_addr_list
    hmemcpy lista&, ByVal listaa&, 4

    'get first list entry
    hmemcpy internet_address&, ByVal lista&, 4

Trust me, it works. And that is basically that.
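For the curious, the same three copies can be played out in portable C with memcpy standing in for hmemcpy. This is a simulation, not a Winsock program: fake_hostent and fake_gethostbyname are invented, the "library memory" is a few static variables, and the pointer copies use sizeof instead of a hard-coded 4 because pointers on a modern machine are usually wider than 32 bits.

```c
#include <stddef.h>
#include <string.h>

/* Same field order as the hostent structure, with native pointers. */
typedef struct {
    char  *h_name;
    char **h_aliases;
    short  h_addrtype;
    short  h_length;
    char **h_addr_list;
} fake_hostent;

/* Pretend this is the library's own memory area. */
static unsigned char lib_addr[4] = { 192, 0, 2, 1 };
static char *lib_addr_list[] = { (char *)lib_addr, NULL };
static char  lib_name[] = "example.test";
static char *lib_aliases[] = { NULL };
static fake_hostent lib_hostent = {
    lib_name, lib_aliases, 2 /* AF_INET */, 4, lib_addr_list
};

/* Hypothetical stand-in for gethostbyname: always "finds" the one host. */
const fake_hostent *fake_gethostbyname(const char *name) {
    (void)name;
    return &lib_hostent;
}

/* Dig out the first address using nothing but block copies.
   Returns 1 on success, 0 if there is nothing to copy. */
int first_address(const char *name, unsigned char out[4]) {
    const fake_hostent *ha = fake_gethostbyname(name);
    fake_hostent mine;
    char *first;

    if (ha == NULL) {
        return 0;
    }
    memcpy(&mine, ha, sizeof mine);                 /* copy the structure   */
    memcpy(&first, mine.h_addr_list, sizeof first); /* first list entry     */
    if (first == NULL) {
        return 0;
    }
    memcpy(out, first, 4);                          /* the address bytes    */
    return 1;
}
```

The NULL checks are an addition of mine. A real lookup can fail and hand back a zero address, and copying from address zero is a good way to meet the GPF.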
Dot Addresses
Usually, people don't want to read the Internet address as a 32 bit number
because it's too big, and portions of it mean different things. Once we have the
32 bit Internet Address we can translate it into 'dot address' format, which
is what humans like to see. There is a Winsock function which does this translation
although it really could be computed in code but why do it when Winsock will?
The usage format is:
inet_ntoa
This call will convert a passed 32 bit Internet Address into a string containing
the 'dot address'. Sounds simple? It is, but if you have been peeking at the spec
you noticed that the call passes a structure. Further, if you went to the structure
definitions you saw structure in_addr defined in some really strange looking C
code with the word 'UNION' buried in it:
    struct in_addr {
        union {
            struct { u_char s_b1, s_b2, s_b3, s_b4; } S_un_b;
            struct { u_short s_w1, s_w2; } S_un_w;
            u_long S_addr;
        } S_un;
        etc., etc., and etc.

Whatever in the world is that? Hey, C is sooo much more user-friendly than Assembler. Let's see if the VB equivalent makes more sense:
    Type in_addr_as_4bytes
        addr_byte1 As String * 1
        addr_byte2 As String * 1
        addr_byte3 As String * 1
        addr_byte4 As String * 1
    End Type

    Type in_addr_as_2words
        addr_word1 As Integer
        addr_word2 As Integer
    End Type

    Type in_addr_as_1long
        addr_long As Long
    End Type

Ah-ha! This is only a way to get to the bytes in a long&, or the words% (integers) in a long&, or the long& itself. Why? I dunno. There is no code that I can find that uses it. So our call is really looking for a long&:
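What the union buys you is easy to see in a small C sketch. The type and function names here are my own, not the ones from the Winsock header. All three views sit on top of the SAME four bytes, so writing through one view and reading through another just re-slices the memory.

```c
#include <stdint.h>

/* Three ways of looking at the same 4 bytes of memory. */
typedef union {
    struct { uint8_t b1, b2, b3, b4; } as_bytes;
    struct { uint16_t w1, w2; } as_words;
    uint32_t as_long;
} addr_views;

/* Write four bytes in memory order, read them back as one 32 bit value. */
uint32_t view_long(uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4) {
    addr_views v;
    v.as_bytes.b1 = b1;
    v.as_bytes.b2 = b2;
    v.as_bytes.b3 = b3;
    v.as_bytes.b4 = b4;
    return v.as_long;
}

/* Write one 32 bit value, read it back as four bytes in memory order. */
void view_bytes(uint32_t addr, uint8_t out[4]) {
    addr_views v;
    v.as_long = addr;
    out[0] = v.as_bytes.b1;
    out[1] = v.as_bytes.b2;
    out[2] = v.as_bytes.b3;
    out[3] = v.as_bytes.b4;
}
```

The numeric value you see in as_long depends on the byte order of the machine, which is why the sketch only promises that the bytes come back in the order they went in.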
    char FAR* PASCAL FAR inet_ntoa (struct in_addr in);

is:

    Declare Function inet_ntoa Lib "winsock.dll" (ByVal iaddr As Long) As Long

and gets used as:

    dota& = inet_ntoa&(internet_addr&)
    'internet_addr& is our 32 bit Internet
    'address from the last call.

    'lstrcpy needs a blank VB target string
    'later, later
    dotaddr$ = Space$(256)
    temp& = lstrcpy&(dotaddr$, dota&)

    'Get rid of nulls copied from Winsock, you gotta write this. It just
    'uses a for loop to scan the string byte by byte and replaces if it gets a match.
    server_dotaddr$ = replacechar(dotaddr$, Chr$(0), " ")

    'And trim it
    server_dotaddr$ = Trim$(server_dotaddr$)

Whoa, what happened? Well, we got a returned value 'dota&' which is a 32 bit pointer to a C string which contains the dot address. We want to copy it into a VB string. We FIRST set the VB string to 256 spaces so lstrcpy has a bucket to put the C string into and then call it. Then, we get rid of any NULLs copied. And then we trim it. What the heck is lstrcpy? A good way to copy C strings into VB strings, that's what.
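Since the translation "could be computed in code", here is one way it might look in C. This is my own sketch, not how Winsock does it internally: format_dot_address is an invented name, and it takes the address as four bytes in the order they sit in memory (which, for an Internet address, is the order they appear in the dot address).

```c
#include <stdint.h>
#include <stdio.h>

/* Format 4 address bytes as "a.b.c.d" into a caller-supplied buffer.
   Returns 1 on success, 0 if the buffer is too small. */
int format_dot_address(const uint8_t addr[4], char *out, size_t out_size) {
    int n = snprintf(out, out_size, "%u.%u.%u.%u",
                     (unsigned)addr[0], (unsigned)addr[1],
                     (unsigned)addr[2], (unsigned)addr[3]);
    return n > 0 && (size_t)n < out_size;
}
```

The longest possible result is "255.255.255.255" plus the terminating null, so a 16 byte buffer is always enough.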
lstrcpy
Wow, another C function in the Windows DLL. What ammo for the next beer-bust
with a bunch of C guys. This function copies a C string from a 32 bit address
into a VB string. Well, into space ALLOCATED for a VB string. This includes
everything, including the terminating NULL, so we have to get rid of that.
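The fill-with-spaces, copy, de-null and trim sequence can be sketched in C as below. It is a simulation of the VB steps, not the Windows lstrcpy itself: copy_and_trim is an invented name, the buffer plays the part of the space allocated for the VB string, and the return value is the length a VB Trim$ would leave you with.

```c
#include <stddef.h>
#include <string.h>

/* Copy a C string into a space-filled, fixed-size buffer, turn the copied
   terminator into a space, and return the length after trimming trailing
   spaces. Returns 0 if the string (plus its null) does not fit. */
size_t copy_and_trim(const char *c_string, char *vb_buf, size_t buf_len) {
    size_t n = strlen(c_string) + 1;        /* characters plus the null   */
    size_t i;
    size_t len;

    if (n > buf_len) {
        return 0;
    }
    memset(vb_buf, ' ', buf_len);           /* like Space$(buf_len)       */
    memcpy(vb_buf, c_string, n);            /* copy through the null      */
    for (i = 0; i < buf_len; i++) {         /* like replacechar           */
        if (vb_buf[i] == '\0') {
            vb_buf[i] = ' ';
        }
    }
    len = buf_len;                          /* like Trim$ on the right    */
    while (len > 0 && vb_buf[len - 1] == ' ') {
        len--;
    }
    return len;
}
```

Note that the result in the buffer is NOT null terminated any more. Like a VB string, it is described by its length.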